The log-Gabor method: speech classification using spectrogram image analysis

Authors

  • Harm Buisman
  • Eric O. Postma
Abstract

We explored the suitability of the log-Gabor method, a speech analysis method inspired by Ezzat et al. (2007), for automatic classification of personality and likability traits in speech. The core idea underlying the log-Gabor method is to treat the spectrogram as an image of spectro-temporal information. The image is transformed into Gabor energy values using the two-dimensional logarithmic Gabor transform, which is a standard feature extraction method in visual texture analysis. The aggregated energy values are mapped onto classes by means of a support vector machine (SVM). The log-Gabor method performed above baseline on the Development sets of the INTERSPEECH Personality and Likability Sub-Challenges and comparably to baseline on the Test sets. These results support further investigation of the log-Gabor method as a method for extracting perceptual cues from speech.
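The feature-extraction part of the pipeline described in the abstract (spectrogram as image, 2-D log-Gabor filtering, aggregated energy) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the frame length and hop, the number of scales and orientations, the bandwidth parameters, and the use of the mean as the aggregation are all hypothetical choices, since the abstract does not specify them. Only NumPy is used.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-power spectrogram, shape (freq bins, frames). Frame sizes are assumed."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log1p(power).T

def log_gabor_bank(shape, n_scales=3, n_orients=4, min_wavelength=4.0,
                   mult=2.0, sigma_on_f=0.55, sigma_theta=np.pi / 8):
    """2-D log-Gabor transfer functions on the FFT grid (all parameters assumed)."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                      # placeholder to avoid log(0) at DC
    theta = np.arctan2(fy, fx)
    filters = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult ** s)   # centre frequency of this scale
        radial = np.exp(-np.log(radius / f0) ** 2
                        / (2 * np.log(sigma_on_f) ** 2))
        radial[0, 0] = 0.0                  # a log-Gabor filter has no DC component
        for o in range(n_orients):
            angle = o * np.pi / n_orients
            # angular distance wrapped to [-pi, pi]
            d = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-d ** 2 / (2 * sigma_theta ** 2))
            filters.append(radial * angular)
    return filters

def log_gabor_features(spec, filters):
    """One aggregated (mean) Gabor energy value per filter."""
    spectrum = np.fft.fft2(spec)
    feats = []
    for transfer in filters:
        response = np.fft.ifft2(spectrum * transfer)  # complex filter response
        feats.append(np.mean(np.abs(response) ** 2))  # energy, averaged over image
    return np.array(feats)

# Synthetic stand-in for a speech recording: a noisy 440 Hz tone at 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(t.size)

spec = spectrogram(signal)
bank = log_gabor_bank(spec.shape)
features = log_gabor_features(spec, bank)
print(features.shape)
```

Each recording yields one fixed-length feature vector, which could then be passed to any SVM implementation for the classification step; that step is omitted here to keep the sketch dependency-free.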


Similar resources

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...


Environmental Sounds Spectrogram Classification using Log-Gabor Filters and Multiclass Support Vector Machines

This paper presents novel approaches for efficient feature extraction using the magnitude spectrogram of environmental sounds. We propose an approach based on the visual domain, comprising three methods. The first method applies a single log-Gabor filter to each spectrogram, followed by a mutual information procedure. In the second method, the spectrogram is passed by the same ste...


Spectro-temporal analysis of speech using 2-d Gabor filters

We present a 2-D spectro-temporal Gabor filterbank based on the 2-D Fast Fourier Transform, and show how it may be used to analyze localized patches of a spectrogram. We argue that the 2-D Gabor filterbank has the capacity to decompose a patch into its underlying dominant spectro-temporal components, and we illustrate the response of our filterbank to different speech phenomena such as harmonic...


Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...


Multiclass Support Vector Machines for Environmental Sounds Recognition with Reassignment Method and Log-Gabor Filters

We present a robust environmental sound classification approach, based on the reassignment method and log-Gabor filters. In this approach, the reassigned spectrogram is passed through a bank of 12 log-Gabor filters applied to three spectrogram patches; the outputs are concatenated, averaged, and then subjected to an optimal feature selection procedure based on a mutual information criterion. The proposed m...



Publication date: 2012